Dictionaries and distributions: Combining expert knowledge and large scale textual data content analysis: Distributed dictionary representation.

Authors

Justin Garten

Joe Hoover

Kate M Johnson

Reihane Boghrati

Carol Iskiwitch

Morteza Dehghani

Abstract

Theory-driven text analysis has made extensive use of psychological concept dictionaries, leading to a wide range of important results. These dictionaries have generally been applied through word count methods which have proven to be both simple and effective. In this paper, we introduce Distributed Dictionary Representations (DDR), a method that applies psychological dictionaries using semantic similarity rather than word counts. This allows for the measurement of the similarity between dictionaries and spans of text ranging from complete documents to individual words. We show how DDR enables dictionary authors to place greater emphasis on construct validity without sacrificing linguistic coverage. We further demonstrate the benefits of DDR on two real-world tasks and finally conduct an extensive study of the interaction between dictionary size and task performance. These studies allow us to examine how DDR and word count methods complement one another as tools for applying concept dictionaries and where each is best applied. Finally, we provide references to tools and resources to make this method both available and accessible to a broad psychological audience.
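
To make the contrast between word counts and semantic similarity concrete, here is a minimal sketch. The abstract does not spell out how the similarity is computed, so the sketch assumes one common approach: represent both the dictionary and the text as the mean of their word vectors and compare them with cosine similarity. The vectors in `VECTORS`, the function names, and the example words are all hypothetical and chosen only for illustration; in practice the vectors would come from a pretrained embedding model.

```python
import math

# Toy 3-dimensional word vectors, invented purely for illustration.
# Real use would load vectors from a pretrained embedding model.
VECTORS = {
    "kind": [0.9, 0.1, 0.0],
    "caring": [0.8, 0.2, 0.1],
    "gentle": [0.85, 0.15, 0.05],
    "helped": [0.7, 0.3, 0.1],
    "neighbor": [0.4, 0.5, 0.2],
    "tax": [0.0, 0.2, 0.9],
    "invoice": [0.1, 0.1, 0.95],
}


def mean_vector(words):
    """Average the vectors of all words that have one; None if none do."""
    known = [VECTORS[w] for w in words if w in VECTORS]
    if not known:
        return None
    dim = len(known[0])
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def word_count_score(dictionary, tokens):
    """Word count method: fraction of tokens that appear in the dictionary."""
    if not tokens:
        return 0.0
    return sum(1 for t in tokens if t in dictionary) / len(tokens)


def similarity_score(dictionary, tokens):
    """Similarity method (assumed form): cosine of the two mean vectors."""
    d, t = mean_vector(dictionary), mean_vector(tokens)
    if d is None or t is None:
        return 0.0
    return cosine(d, t)


# A deliberately small dictionary and two texts that share no words with it.
dictionary = ["kind", "caring"]
related_text = ["gentle", "helped", "neighbor"]
unrelated_text = ["tax", "invoice"]

count_related = word_count_score(dictionary, related_text)
count_unrelated = word_count_score(dictionary, unrelated_text)
sim_related = similarity_score(dictionary, related_text)
sim_unrelated = similarity_score(dictionary, unrelated_text)
```

With these toy vectors, the word count score is zero for both texts because neither contains a dictionary word, while the similarity score is higher for the related text than for the unrelated one. This illustrates how a short dictionary can still cover text that shares none of its exact words.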

Similar resources

A New Dictionary Construction Method in Sparse Representation Techniques for Target Detection in Hyperspectral Imagery

Hyperspectral data in remote sensing, gathered at fine spectral resolution (about 10 nanometers), contain a plethora of spectral bands (roughly 200 bands). Since precious information about the spectral features of target materials can be extracted from these data, they have been used exclusively in hyperspectral target detection. One of the problems associated with the detect...

ILLINOIS-PROFILER: Knowledge Schemas at Scale

In many natural language processing tasks, contextual information from given documents alone is not sufficient to support the desired textual inference. In such cases, background knowledge about certain entities and concepts could be quite helpful. While many knowledge bases (KBs) focus on combining data from existing databases, including dictionaries and other human generated knowledge, we obs...

Creating a Comparative Dictionary of Totonac-Tepehua

We apply algorithms for the identification of cognates and recurrent sound correspondences proposed by Kondrak (2002) to the Totonac-Tepehua family of indigenous languages in Mexico. We show that by combining expert linguistic knowledge with computational analysis, it is possible to quickly identify a large number of cognate sets within the family. Our objective is to provide tools for rapid co...

Using the Textual Content of the LMF-Normalized Dictionaries for Identifying and Linking the Syntactic Behaviors to the Meanings

In this paper we propose an approach for identifying syntactic behaviours related to lexical items and linking them to the meanings. This approach is based on the analysis of the textual content presented in LMF normalized dictionaries by means of Definition and Context classes. The main particularity of these contents is their large availability and their semantically control due to their atta...

Combining Dictionary-Based and Example-Based Methods for Natural Language Analysis

We propose combining dictionary-based and example-based natural language (NL) processing techniques in a framework that we believe will provide substantive enhancements to NL analysis systems. The centerpiece of this framework is a relatively large-scale lexical knowledge base that we have constructed automatically from an online version of Longman's Dictionary of Contemporary English (LDOCE), ...

Journal:

Behavior Research Methods

Volume 50, Issue 1

Pages: -

Publication date: 2018
